Opaque Control-Flow Integrity
Abstract
A new binary software randomization and Control-Flow Integrity (CFI) enforcement system is presented, which is the first to efficiently resist code-reuse attacks launched by informed adversaries who possess full knowledge of the in-memory code layout of victim programs. The defense mitigates a recent wave of implementation disclosure attacks, by which adversaries can exfiltrate in-memory code details in order to prepare code-reuse attacks (e.g., Return-Oriented Programming (ROP) attacks) that bypass fine-grained randomization defenses. Such implementation-aware attacks defeat traditional fine-grained randomization by undermining its assumption that the randomized locations of abusable code gadgets remain secret. Opaque CFI (O-CFI) overcomes this weakness through a novel combination of fine-grained code randomization and coarse-grained control-flow integrity checking. It conceals the graph of hijackable control-flow edges even from attackers who can view the complete stack, heap, and binary code of the victim process. For maximal efficiency, the integrity checks are implemented using instructions that will soon be hardware-accelerated on commodity x86-x64 processors. The approach is highly practical since it does not require a modified compiler and can protect legacy binaries without access to source code. Experiments using our fully functional prototype implementation show that O-CFI provides significant probabilistic protection against ROP attacks launched by adversaries with complete code layout knowledge, and exhibits only 4.7% mean performance overhead on current hardware (with further overhead reductions to follow on forthcoming Intel processors).

I. MOTIVATION

Code-reuse attacks (cf. [5]) have become a mainstay of software exploitation over the past several years, due to the rise of data execution protections that nullify traditional code-injection attacks.
Rather than injecting malicious payload code directly onto the stack or heap, where modern data execution protections block it from being executed, attackers now ingeniously inject addresses of existing in-memory code fragments (gadgets) onto victim stacks, causing the victim process to execute its own binary code in an unanticipated order [38]. With a sufficiently large victim code section, the pool of exploitable gadgets becomes arbitrarily expressive (e.g., Turing-complete) [20], facilitating the construction of arbitrary attack payloads without the need for code injection. Such payload construction has even been automated [34]. As a result, code reuse has largely replaced code injection as one of the top software security threats.

Permission to freely reproduce all or part of this paper for noncommercial purposes is granted provided that copies bear this notice and the full citation on the first page. Reproduction for commercial purposes is strictly prohibited without the prior written consent of the Internet Society, the first-named author (for reproduction of an entire paper only), and the author’s employer if the paper was prepared within the scope of employment. NDSS ’15, 8–11 February 2015, San Diego, CA, USA. Copyright 2015 Internet Society, ISBN 1-891562-38-X. http://dx.doi.org/10.14722/ndss.2015.23271

This has motivated copious work on defenses against code-reuse threats. Prior defenses can generally be categorized into two groups: CFI [1] and artificial software diversity [8].

CFI restricts all of a program’s runtime control-flows to a graph of whitelisted control-flow edges. Usually the graph is derived from the semantics of the program source code or a conservative disassembly of its binary code. As a result, CFI-protected programs reject control-flow hijacks that attempt to traverse edges not supported by the original program’s semantics. Fine-grained CFI monitors indirect control-flows precisely; for example, function callees must return to their exact callers.
Although such precision provides the highest security, it also tends to incur high performance overheads (e.g., 21% for precise caller-callee return-matching [1]). Because this overhead is often too high for industry adoption, researchers have proposed many optimized, coarser-grained variants of CFI.

Coarse-grained CFI trades some security for better performance by reducing the precision of the checks. For example, functions must return to valid call sites (but not necessarily to the particular site that invoked the callee). Unfortunately, such relaxations have proved dangerous: a number of recent proof-of-concept exploits have shown how even minor relaxations of the control-flow policy can be exploited to effect attacks [6, 11, 18, 19]. Table I summarizes the impact of several of these recent exploits.

Artificial software diversity offers a different but complementary approach that randomizes programs in such a way that attacks succeeding against one program instance have a very low probability of success against other (independently randomized) instances of the same program. Probabilistic defenses rely on memory secrecy; i.e., the effects of randomization must remain hidden from attackers. One of the simplest and most widely adopted forms of artificial diversity is Address Space Layout Randomization (ASLR), which randomizes the base addresses of program segments at load time. Unfortunately, merely randomizing the base addresses does not yield sufficient entropy to preserve memory secrecy in many cases; there are numerous successful derandomization attacks against ASLR [13, 26, 36, 37, 39, 42]. Finer-grained diversity techniques obtain exponentially higher entropy by randomizing the relative distances between all code points. For example, binary-level Self-Transforming Instruction Relocation (STIR) [45] and compilers with randomized code generation (e.g., [22]) have both realized fine-grained artificial diversity for production-level software at very low overheads.
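To make the entropy argument above concrete, the following toy model may help. It is only a sketch under stated assumptions: the block names and sizes in `BLOCKS`, the helper names, and the single-pointer leak are all hypothetical and are not taken from STIR, ASLR implementations, or the paper's prototype. It contrasts base-only randomization, where one leaked address reveals every other address, with a fine-grained scheme that permutes block order.

```python
from itertools import permutations

# Hypothetical toy program: code blocks and their sizes in bytes.
BLOCKS = {"main": 64, "parse": 128, "auth": 96, "log": 32}

def layout(order, base):
    """Place blocks back to back in the given order, starting at base."""
    addr, result = base, {}
    for name in order:
        result[name] = addr
        addr += BLOCKS[name]
    return result

# The attacker's own, unrandomized copy of the program.
REFERENCE = layout(BLOCKS, base=0)

def infer_from_leak(leaked_name, leaked_addr):
    """Guess the victim layout from one leaked address plus known offsets."""
    delta = leaked_addr - REFERENCE[leaked_name]
    return {name: addr + delta for name, addr in REFERENCE.items()}

def aslr_leak_succeeds(base):
    """Base-only randomization: relative offsets are unchanged."""
    victim = layout(BLOCKS, base)
    return infer_from_leak("log", victim["log"]) == victim

def fine_grained_success_count(base):
    """Count block orderings for which the same single-leak guess is right."""
    orders = list(permutations(BLOCKS))
    hits = 0
    for order in orders:
        victim = layout(order, base)
        if infer_from_leak("log", victim["log"]) == victim:
            hits += 1
    return hits, len(orders)

print(aslr_leak_succeeds(0x7F000000))          # True
print(fine_grained_success_count(0x7F000000))  # (1, 24)
```

With four blocks there are already 4! = 24 orderings, and the single-leak guess is correct for only one of them; the count grows factorially with the number of randomized code units. The model says nothing about attackers who can read the code pages themselves, which is the case the rest of the paper addresses.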
Recently, a new wave of implementation disclosure attacks [4, 10, 35, 40] has threatened to undermine fine-grained artificial diversity defenses. Implementation disclosure attacks exploit information leak vulnerabilities to read memory pages of victim processes at the discretion of the attacker. By reading the in-memory code sections, attackers violate the memory secrecy assumptions of artificial diversity, rendering such defenses ineffective. Since finding and closing all information leaks is well known to be prohibitively difficult and often intractable for many large software products, these attacks constitute a very dangerous development in the cyber-threat landscape; there is currently no well-established, practical defense.

TABLE I. OVERVIEW OF CONTROL-FLOW INTEGRITY BYPASSES
Defenses (columns): CFI [1], bin-CFI [50], CCFIR [49], kBouncer [33], ROPecker [7], ROPGuard [16], EMET [30]
Bypass                    Date       Defenses marked as bypassed
DeMott [12]               Feb 2014   1
Göktaş et al. [18]        May 2014   3
Davi et al. [11]          Aug 2014   5
Göktaş et al. [19]        Aug 2014   2
Carlini and Wagner [6]    Aug 2014   2

This paper presents Opaque CFI (O-CFI): a new approach to coarse-grained CFI that strengthens fine-grained artificial diversity to withstand implementation disclosure attacks. The heart of O-CFI is a new form of control-flow check that conceals the graph of abusable control-flow edges even from attackers who have complete read access to the randomized binary code, the stack, and the heap of victim processes. Such access only affords attackers knowledge of the intended (and therefore non-abusable) edges of the control-flow graph, not the edges left unprotected by the coarse-grained CFI implementation. Artificial diversification is employed to vary the set of unprotected edges between program instances, maintaining the probabilistic guarantees of fine-grained diversity.

Experiments show that O-CFI enjoys performance overheads comparable to standard fine-grained diversity and non-opaque, coarse-grained CFI.
Moreover, O-CFI’s control-flow checking logic is implemented using Intel x86/x64 memory-protection extensions (MPX) that are expected to be hardware-accelerated in commodity CPUs from 2015 onwards. We therefore expect even better performance for O-CFI in the near future.

Our contributions are as follows:

• We introduce O-CFI, the first low-overhead code-reuse defense that tolerates implementation disclosures.
• We describe our implementation of a fully functional prototype that protects stripped, x86 legacy binaries without source code.
• Analysis shows that O-CFI provides quantifiable security against state-of-the-art exploits, including JIT-ROP [40] and Blind-ROP [4].
• Performance evaluation yields competitive overheads of just 4.7% for computation-intensive programs.

II. THREAT MODEL

Our work is motivated by the emergence of attacks against fine-grained diversity and coarse-grained control-flow integrity. We therefore introduce these attacks and distill them into a single, unified threat model.

A. Bypassing Coarse-Grained CFI

Ideally, CFI permits only programmer-intended control-flow transfers during a program’s execution. The typical approach is to assign a unique ID to each permissible indirect control-flow target, and check the IDs at runtime. Unfortunately, this introduces performance overhead proportional to the degree of the graph: the more overlaps between valid target sets of indirect branch instructions, the more IDs must be stored and checked at each branch. Moreover, perfect CFI cannot be realized with a purely static control-flow graph; for example, the permissible destinations of function returns depend on the calling context, which is only known at runtime. Fine-grained CFI therefore implements a dynamically computed shadow stack, incurring high overheads [1]. To avoid this, coarse-grained CFI implementations resort to a reduced-degree, static approximation of the control-flow graph, and merge identifiers at the cost of reduced security.
For example, bin-CFI [49] and CCFIR [50] use at most three IDs per branch, and omit shadow stacks.

Recent work has demonstrated that these optimizations open exploitable security holes. By choosing ROP gadgets that start at a function entry point or are call-preceded, it is possible to build ROP chains that bypass CFI [19], including subverting CCFIR and bin-CFI. Related works [6, 11] have similarly shown that call-preceded gadgets can bypass bin-CFI as well as other low-overhead approaches that only check control-flow transfers before potentially dangerous function calls [7, 16, 30, 33]. Table I maps coarse-grained CFI approaches to the corresponding proof-of-concept bypasses. Note that the bypass of the original CFI approach assumes that returns are not tracked precisely using a shadow stack.

Just-In-Time Code Reuse. Until recently, most threat models for CFI and artificial diversity defenses assumed that the memory contents of protected processes were hidden from attackers. The advent of Just-In-Time ROP (JIT-ROP) [40] demonstrated that this assumption might be unrealistic in practice due to the existence of implementation disclosure vulnerabilities. Using heap feng shui [41], JIT-ROP places a buffer next to a string and a button object. By overflowing the buffer, the string length is set arbitrarily high, allowing the attacker to read any byte in the virtual address space. Parsing the button object through the overflowed string yields a reference to a mapped code page.

Typically, attackers need more than a single 4K page worth of code to find enough gadgets to mount a code-reuse attack. To discourage brute-force searches for more code pages, artificial diversity defenses routinely mine the address space with unmapped pages that abort the process if accessed [2]. JIT-ROP evades these mines by disassembling the initial code page and carefully traversing only direct references to other code pages to recursively discover enough gadgets to mount a ROP attack.
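The page-discovery step described above can be sketched as a graph traversal. This is a minimal illustration under stated assumptions, not the JIT-ROP implementation: memory is modeled as a hypothetical dictionary `MEMORY` from mapped page numbers to the direct branch targets a disassembler would find on that page, unmapped "mine" pages are simply absent, and `TrapHit`, `read_page`, `discover_code_pages`, and `linear_scan` are names invented for this example.

```python
from collections import deque

PAGE_SIZE = 4096

class TrapHit(Exception):
    """Reading an unmapped mine page; the real process would abort."""

# Hypothetical address space: mapped page number -> direct branch targets
# found on that page. Pages 0x11, 0x13 and 0x14 are unmapped mines.
MEMORY = {
    0x10: [0x12040, 0x10080],
    0x12: [0x15000],
    0x15: [0x10010],
    0x20: [],  # mapped, but never the target of a direct reference
}

def read_page(memory, page):
    """Model a memory disclosure of one page."""
    if page not in memory:
        raise TrapHit(page)
    return memory[page]

def discover_code_pages(memory, leaked_addr):
    """Breadth-first search that follows only direct references."""
    start = leaked_addr // PAGE_SIZE
    seen, queue = {start}, deque([start])
    while queue:
        page = queue.popleft()
        for target in read_page(memory, page):
            nxt = target // PAGE_SIZE
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)

def linear_scan(memory, leaked_addr, count):
    """Naive alternative: read consecutive pages, mines included."""
    start = leaked_addr // PAGE_SIZE
    for page in range(start, start + count):
        read_page(memory, page)
    return list(range(start, start + count))

print([hex(p) for p in discover_code_pages(MEMORY, 0x10ABC)])
# ['0x10', '0x12', '0x15']
```

In this model the traversal reads only pages that some already-read page branches to directly, so it never touches a mine, whereas `linear_scan(MEMORY, 0x10ABC, 4)` raises `TrapHit` on page 0x11. Page 0x20 stays undiscovered because nothing references it directly.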
Since gadget locations are no longer unknown to the attacker, reliable construction of custom ROP chains becomes possible despite the fine-grained randomization defense.

Blind ROP. While JIT-ROP targets scripting-enabled clients, Blind Return Oriented Programming (BROP) [4] targets vulnerable Internet-facing services, such as web servers, that restart after a crash. It capitalizes on the observation that child processes created with the fork system call on Linux must be randomized in the same way as their parent in order